blobovnicza: Add benchmark to test different tree settings #2457

cthulhu-rider · 2023-07-25T14:08:00Z

current results:

goos: linux
goarch: amd64
pkg: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/blobstor/blobovniczatree
cpu: Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
BenchmarkBlobovniczas_Put/tree=1x0-8                 100          53517515 ns/op         4609697 B/op        109 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                100          53417156 ns/op         5213543 B/op        119 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                 100          52124287 ns/op         5117641 B/op        158 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                 100          37065895 ns/op         2129467 B/op        188 allocs/op

According to the test, tree management doesn't really help when the number of layers is small. 4x4 performed better, which looks suspicious, but for now it may be expected, since we haven't tried to optimize work with a single DB instance as described in #2453.

P.S. In the current test implementation, it's not obvious whether we force a switch to other DBs or not. I tried to achieve this with -benchtime=100x, where 100 is fullSizeLimit/singleObjectSize.
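
The iteration count above is plain division of the size limit by the object size. A minimal sketch of that arithmetic, using the constant values that appear in the test; the helper name putsToFill is hypothetical, and the calculation ignores any per-object storage overhead and assumes the limit applies to one DB:

```go
package main

import "fmt"

// putsToFill returns how many objects of objSize bytes fit into fullSize
// bytes. Hypothetical helper for illustration, not part of the PR.
func putsToFill(fullSize, objSize uint64) uint64 {
	return fullSize / objSize
}

func main() {
	const fullSizeLimit = 100 << 20 // 100 MiB, as in the benchmark

	fmt.Println(putsToFill(fullSizeLimit, 1<<20)) // 1 MiB objects: 100
	fmt.Println(putsToFill(fullSizeLimit, 1<<12)) // 4 KiB objects: 25600
}
```

With 4 KiB objects, 100 iterations would write only about 400 KiB, so a run of that length is unlikely to reach the size limit under these assumptions.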

codecov · 2023-07-25T14:11:54Z

Codecov Report

Merging #2457 (55243c8) into master (7b86fa2) will increase coverage by 0.02%.
Report is 5 commits behind head on master.
The diff coverage is 100.00%.

❗ Current head 55243c8 differs from pull request most recent head d22c757. Consider uploading reports for the commit d22c757 to get more accurate results

@@            Coverage Diff             @@
##           master    #2457      +/-   ##
==========================================
+ Coverage   29.44%   29.46%   +0.02%     
==========================================
  Files         399      399              
  Lines       30385    30392       +7     
==========================================
+ Hits         8946     8955       +9     
+ Misses      20696    20694       -2     
  Partials      743      743

Files Changed	Coverage Δ
pkg/local_object_storage/blobstor/put.go	`88.46% <100.00%> (+4.25%)`	⬆️

... and 1 file with indirect coverage changes


roman-khimov · 2023-07-25T14:45:01Z

Here:

BenchmarkBlobovniczas_Put/tree=1x0-8                  57          27046852 ns/op         4786905 B/op        113 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  36          32247503 ns/op         4681081 B/op        109 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  40          31945389 ns/op         4855398 B/op        110 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 34          33063479 ns/op         5213628 B/op        120 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 37          33285487 ns/op         5195977 B/op        119 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 45          33081278 ns/op         5128587 B/op        120 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  44          30096345 ns/op         5198224 B/op        150 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  70          19756034 ns/op         4950343 B/op        151 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  74          20097768 ns/op         5203916 B/op        153 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  74          17488102 ns/op         2129200 B/op        183 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  70          16829594 ns/op         2128992 B/op        182 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  74          18119955 ns/op         2129311 B/op        187 allocs/op

But it's intended for small objects, so I've made:

--- a/pkg/local_object_storage/blobstor/blobovniczatree/put_test.go
+++ b/pkg/local_object_storage/blobstor/blobovniczatree/put_test.go
@@ -47,7 +47,7 @@ func benchmarkPutMN(b *testing.B, depth, width uint64) {
                nBlobovniczas *= width
        }
 
-       const objSizeLimit = 1 << 20
+       const objSizeLimit = 1 << 12
        const fullSizeLimit = 100 << 20
 
        bbcz := NewBlobovniczaTree(
@@ -66,6 +66,7 @@ func benchmarkPutMN(b *testing.B, depth, width uint64) {
                RawData: make([]byte, objSizeLimit),
        }
 
+       rand.Read(prm.RawData)
        b.ReportAllocs()
        b.ResetTimer()
 
@@ -74,7 +75,6 @@ func benchmarkPutMN(b *testing.B, depth, width uint64) {
        for i := 0; i < b.N; i++ {
                b.StopTimer()
                prm.Address = oidtest.Address()
-               rand.Read(prm.RawData)
                b.StartTimer()
 
                _, err = bbcz.Put(prm)

And got:

BenchmarkBlobovniczas_Put/tree=1x0-8                  87          12323530 ns/op           33656 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  92          13031141 ns/op           33912 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  84          12944640 ns/op           33466 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 93          12600201 ns/op           33818 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 88          12914871 ns/op           33901 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 88          12480286 ns/op           33302 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  81          12740758 ns/op           34692 B/op        111 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  93          13036401 ns/op           34883 B/op        111 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  92          13253987 ns/op           35103 B/op        111 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  91          13660381 ns/op           37916 B/op        179 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  88          15612270 ns/op           37448 B/op        179 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  85          15245248 ns/op           38210 B/op        180 allocs/op

Then const objSizeLimit = 1 for the fun of it:

BenchmarkBlobovniczas_Put/tree=1x0-8                  84          13757404 ns/op           12682 B/op         68 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  80          14032831 ns/op           12531 B/op         68 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  82          13922973 ns/op           12701 B/op         68 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 94          14235156 ns/op           12628 B/op         69 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 85          14489805 ns/op           12677 B/op         68 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 81          14262814 ns/op           12712 B/op         68 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  81          14078772 ns/op           12554 B/op         99 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  88          14096657 ns/op           12192 B/op        100 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  96          14447775 ns/op           12224 B/op         99 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  93          14432768 ns/op           20945 B/op        164 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  93          13946014 ns/op           20724 B/op        164 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  81          14332493 ns/op           21495 B/op        167 allocs/op

4×4 tests take noticeably more time to initialize, probably related to #2215.
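
One factor that may contribute to the longer setup is simply the number of leaf DBs. The sketch below mirrors the `nBlobovniczas *= width` line visible in the test diff, assuming it sits in a loop that runs once per level of depth; the function name leafCount is hypothetical, and whether this is the cause behind #2215 is not established here:

```go
package main

import "fmt"

// leafCount multiplies by width once per level of depth, which is how the
// test diff appears to compute nBlobovniczas (assumption: the loop runs
// `depth` times).
func leafCount(width, depth uint64) uint64 {
	n := uint64(1)
	for i := uint64(0); i < depth; i++ {
		n *= width
	}
	return n
}

func main() {
	fmt.Println(leafCount(2, 2)) // 4
	fmt.Println(leafCount(4, 4)) // 256
}
```

For the 2x2 and 4x4 cases the result does not depend on which number in the benchmark name is the width and which is the depth.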

roman-khimov · 2023-07-25T15:04:17Z

And for everyone who loves Bolt's Batch(), 4K rawdata again:

BenchmarkBlobovniczas_Put/tree=1x0-8                  79          13972488 ns/op           32855 B/op         76 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  75          14311804 ns/op           32893 B/op         76 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  70          14741192 ns/op           33038 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 74          21875172 ns/op           33024 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 51          19937993 ns/op           33017 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 60          19911600 ns/op           33390 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  57          19892171 ns/op           34676 B/op        112 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  55          20462990 ns/op           35634 B/op        112 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  51          20171586 ns/op           35801 B/op        112 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  74          16178143 ns/op           37526 B/op        177 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  82          15692310 ns/op           38175 B/op        180 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  70          15463406 ns/op           38270 B/op        181 allocs/op

And now 🪄:

--- a/pkg/local_object_storage/blobovnicza/put.go
+++ b/pkg/local_object_storage/blobovnicza/put.go
@@ -52,7 +52,7 @@ func (b *Blobovnicza) Put(prm PutPrm) (PutRes, error) {
        bucketName := bucketForSize(sz)
        key := addressKey(prm.addr)
 
-       err := b.boltDB.Batch(func(tx *bbolt.Tx) error {
+       err := b.boltDB.Update(func(tx *bbolt.Tx) error {
                if b.full() {
                        return ErrFull
                }

Which leads to

BenchmarkBlobovniczas_Put/tree=1x0-8                 702           2028573 ns/op           34427 B/op         78 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                 666           1835332 ns/op           34542 B/op         78 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                 670           1910506 ns/op           34466 B/op         78 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                649           2054211 ns/op           34187 B/op         78 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                633           1892979 ns/op           34329 B/op         78 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                662           2001137 ns/op           34367 B/op         78 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                 556           1927446 ns/op           34935 B/op        103 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                 634           1939887 ns/op           34880 B/op        104 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                 616           1912483 ns/op           35060 B/op        104 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                 483           3807073 ns/op           37531 B/op        153 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                 532           3495096 ns/op           37740 B/op        152 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                 362           4435523 ns/op           36668 B/op        156 allocs/op
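
To put a number on the difference, the ns/op samples from the 4K Batch() runs and the Update() runs above can be averaged and divided. A small sketch; the helper names mean and speedup are illustrative only, and three samples per configuration is too few for any statistical claim:

```go
package main

import "fmt"

// mean returns the arithmetic mean of the ns/op samples.
func mean(xs []float64) float64 {
	var sum float64
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// speedup reports how many times lower the mean of after is than the mean of before.
func speedup(before, after []float64) float64 {
	return mean(before) / mean(after)
}

func main() {
	// ns/op values copied from the single-threaded 4K runs in this thread.
	batch1x0 := []float64{13972488, 14311804, 14741192}
	update1x0 := []float64{2028573, 1835332, 1910506}
	batch4x4 := []float64{16178143, 15692310, 15463406}
	update4x4 := []float64{3807073, 3495096, 4435523}

	fmt.Printf("tree=1x0: Update is %.1fx faster than Batch\n", speedup(batch1x0, update1x0))
	fmt.Printf("tree=4x4: Update is %.1fx faster than Batch\n", speedup(batch4x4, update4x4))
}
```

On these samples the single-threaded gain is roughly 7x for tree=1x0 and roughly 4x for tree=4x4.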

goos: linux
goarch: amd64
pkg: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/blobstor/blobovniczatree
cpu: Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
BenchmarkBlobovniczas_Put/tree=1x0-8                  61          19747177 ns/op           33398 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 60          18623677 ns/op           33600 B/op         77 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  56          20861449 ns/op           36191 B/op        112 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  43          25999988 ns/op           38511 B/op        182 allocs/op

Signed-off-by: Leonard Lyubich <[email protected]>

cthulhu-rider · 2023-07-25T15:22:20Z

> But it's intended for small objects, so I've made

Yep, I got excited about 1M, which is the default for a storage node. I changed it to 4K, got similar results, and updated the test.

> And for everyone who loves Bolt's Batch()

Personally, I haven't yet had the opportunity to test the benefits of Bolt's native batching in practice, but according to these results its benefits in the blobovnicza tree are doubtful.

I think we have sufficient evidence to try to optimize the data structure into a single database with custom batching, huh? @roman-khimov @carpawell

roman-khimov · 2023-07-25T15:49:41Z

Want more fun? Add some threading into the mix:

@@ -66,28 +67,33 @@ func benchmarkPutMN(b *testing.B, depth, width uint64) {
                RawData: make([]byte, objSizeLimit),
        }
 
+       rand.Read(prm.RawData)
        b.ReportAllocs()
        b.ResetTimer()
 
-       var err error
+       var wg sync.WaitGroup
 
-       for i := 0; i < b.N; i++ {
-               b.StopTimer()
-               prm.Address = oidtest.Address()
-               rand.Read(prm.RawData)
-               b.StartTimer()
+       var f = func(prm common.PutPrm) {
+               var err error
+               for i := 0; i < b.N; i++ {
+                       prm.Address = oidtest.Address()
 
-               _, err = bbcz.Put(prm)
+                       _, err = bbcz.Put(prm)
 
-               b.StopTimer()
-               if err != nil {
-                       if errors.Is(err, common.ErrNoSpace) {
-                               break
+                       if err != nil {
+                               if errors.Is(err, common.ErrNoSpace) {
+                                       break
+                               }
+                               require.NoError(b, err)
                        }
-                       require.NoError(b, err)
                }
-               b.StartTimer()
+               wg.Done()
        }
+       for j := 0; j < 20; j++ {
+               wg.Add(1)
+               go f(prm)
+       }
+       wg.Wait()
 }
 
 func BenchmarkBlobovniczas_Put(b *testing.B) {

Batched:

BenchmarkBlobovniczas_Put/tree=1x0-8                  82          14652080 ns/op          527065 B/op        771 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  79          17694800 ns/op          521569 B/op        770 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  49          22042910 ns/op          504812 B/op        718 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 49          23533586 ns/op          498820 B/op        709 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 54          43626269 ns/op          503995 B/op        723 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 73          14796724 ns/op          515087 B/op        754 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  86          14203539 ns/op          589119 B/op       1528 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  81          14044176 ns/op          576998 B/op       1511 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  84          14373252 ns/op          581395 B/op       1520 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  68          15251037 ns/op          866981 B/op       3052 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  72          22075904 ns/op          881648 B/op       3063 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  61          37769824 ns/op          867537 B/op       3084 allocs/op

Non-batched:

BenchmarkBlobovniczas_Put/tree=1x0-8                  16          74108820 ns/op          680705 B/op       1510 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  14          74459352 ns/op          679138 B/op       1492 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  15          75338695 ns/op          674179 B/op       1489 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 15          77471599 ns/op          686840 B/op       1500 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 15          73161863 ns/op          676613 B/op       1504 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 15          73685846 ns/op          679125 B/op       1498 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  38          28185505 ns/op          711560 B/op       2115 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  37          29134521 ns/op          713372 B/op       2116 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  40          31421310 ns/op          714485 B/op       2128 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                 194           5928512 ns/op          927681 B/op       2907 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                 177           5974973 ns/op          927027 B/op       2909 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                 182           6380906 ns/op          925895 B/op       2908 allocs/op

100 threads batched:

BenchmarkBlobovniczas_Put/tree=1x0-8                  46          32179025 ns/op         2479720 B/op       3685 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  63          36400444 ns/op         2534475 B/op       3832 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                  38          39817475 ns/op         2443816 B/op       3571 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 32          38128756 ns/op         2335914 B/op       3434 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 34          36719362 ns/op         2328869 B/op       3380 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                 34          38328624 ns/op         2318948 B/op       3381 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  36          66034753 ns/op         2782671 B/op       7459 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  49          25038323 ns/op         2680942 B/op       6895 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                  50          24928362 ns/op         2631979 B/op       6858 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  34          30344009 ns/op         4119804 B/op      14370 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  44          23330084 ns/op         4158518 B/op      14287 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  44          56350572 ns/op         4269706 B/op      14784 allocs/op

and not:

BenchmarkBlobovniczas_Put/tree=1x0-8                   6         378786863 ns/op         3420346 B/op       7844 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                   4         387815959 ns/op         3413120 B/op       7665 allocs/op
BenchmarkBlobovniczas_Put/tree=1x0-8                   3         347173756 ns/op         3375656 B/op       7548 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                  3         382709638 ns/op         3385261 B/op       7535 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                  3         384898018 ns/op         3405813 B/op       7527 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0-8                  4         375371866 ns/op         3444208 B/op       7691 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                   9         154033942 ns/op         3552957 B/op      10697 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                   7         154035856 ns/op         3545157 B/op      10573 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2-8                   7         157911228 ns/op         3545810 B/op      10581 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  31          37806730 ns/op         4648195 B/op      24533 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  15          82339257 ns/op         4463610 B/op      19863 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4-8                  60          21126481 ns/op         4325975 B/op      16357 allocs/op
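
When reading the threaded numbers, note that in the diff above every goroutine runs its own `b.N` loop, so one reported "op" covers one put per goroutine. A rough per-put figure can be obtained by dividing ns/op by the goroutine count. This is a sketch under the assumption that no goroutine left its loop early on `ErrNoSpace`; the helper name perPut is hypothetical:

```go
package main

import "fmt"

// perPut converts a reported ns/op into an approximate cost of a single Put,
// given that each benchmark iteration performs one Put per goroutine.
func perPut(nsPerOp float64, goroutines int) float64 {
	return nsPerOp / float64(goroutines)
}

func main() {
	// First tree=1x0 sample of each threaded run in this thread.
	fmt.Printf("20 goroutines, Batch:   %.0f ns/put\n", perPut(14652080, 20))
	fmt.Printf("20 goroutines, Update:  %.0f ns/put\n", perPut(74108820, 20))
	fmt.Printf("100 goroutines, Batch:  %.0f ns/put\n", perPut(32179025, 100))
	fmt.Printf("100 goroutines, Update: %.0f ns/put\n", perPut(378786863, 100))
}
```

Read this way, the first 20-goroutine tree=1x0 sample costs about 0.73 ms per put with Batch() and about 3.7 ms with Update(), the reverse of the single-threaded ordering.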

cthulhu-rider · 2023-07-25T16:31:15Z

> Want more fun?

not really, swing in one direction or the other

> Add some threading into the mix

I can add parallelism (as an option or not)
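
One standard-library way to make parallelism optional is `b.RunParallel`. The sketch below is not the code from this PR: fakeStore is a hypothetical stand-in for the blobovnicza tree, and `testing.Benchmark` is used only so the example runs outside `go test`.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"testing"
)

// fakeStore is a hypothetical stand-in for the tree; it only counts puts.
type fakeStore struct{ puts atomic.Int64 }

func (s *fakeStore) Put(data []byte) error {
	s.puts.Add(1)
	return nil
}

// benchmarkPut runs the put loop sequentially or via b.RunParallel.
func benchmarkPut(b *testing.B, parallel bool) {
	s := &fakeStore{}
	data := make([]byte, 1<<12)
	b.ReportAllocs()
	b.ResetTimer()

	if parallel {
		b.RunParallel(func(pb *testing.PB) {
			for pb.Next() {
				if err := s.Put(data); err != nil {
					// Error is safe to call from worker goroutines;
					// Fatal/FailNow must stay on the benchmark goroutine.
					b.Error(err)
					return
				}
			}
		})
		return
	}
	for i := 0; i < b.N; i++ {
		if err := s.Put(data); err != nil {
			b.Fatal(err)
		}
	}
}

// run executes one benchmark mode and returns its result.
func run(parallel bool) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) { benchmarkPut(b, parallel) })
}

func main() {
	for _, parallel := range []bool{false, true} {
		fmt.Printf("parallel=%v: %s\n", parallel, run(parallel))
	}
}
```

Unlike the ad-hoc goroutine loop posted earlier, `RunParallel` splits the `b.N` iterations across its goroutines, so ns/op stays a per-put figure.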

roman-khimov · 2023-07-25T17:32:35Z

> not really, swing in one direction or the other

It's not, in fact. It's very, very consistent and predictable. But the point remains: the bbcz tree adds zero value.

roman-khimov · 2023-07-25T18:39:00Z

We can add all modes, merge this one (just to save it for the future) and then take a 🔪 and do #2453.

goos: linux
goarch: amd64
pkg: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/blobstor/blobovniczatree
cpu: Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
BenchmarkBlobovniczas_Put/tree=1x0_parallel-8         50          25138722 ns/op          503770 B/op        718 allocs/op
BenchmarkBlobovniczas_Put/tree=10x0_parallel-8        49          24535074 ns/op          502562 B/op        723 allocs/op
BenchmarkBlobovniczas_Put/tree=2x2_parallel-8         30          53013230 ns/op          630769 B/op       1748 allocs/op
BenchmarkBlobovniczas_Put/tree=4x4_parallel-8         19          54977576 ns/op          762231 B/op       3308 allocs/op

Signed-off-by: Leonard Lyubich <[email protected]>

cthulhu-rider · 2023-07-26T14:34:12Z

added parallel runs d22c757

cthulhu-rider requested review from roman-khimov and carpawell as code owners July 25, 2023 14:08

cthulhu-rider force-pushed the bugfix/blobovnicza-tree-zero-depth branch from 2080185 to cc3fb47 Compare July 25, 2023 15:12

carpawell approved these changes Jul 26, 2023


roman-khimov approved these changes Jul 26, 2023


roman-khimov merged commit 782f658 into nspcc-dev:master Jul 26, 2023
6 of 8 checks passed
